The overall purposes of the weatherProject package is to empower users to be able to retrieve weather data utilizing Weather Underground (https://www.wunderground.com/) and subsequently visualize that data in order to observe differences between microclimates within the vacinity of a city.
In order to achieve that end state, there are various functions included with the package, which will have examples demonstrated in the following sections. Those functions do things like retrieve latitude/longitude for a city, get other cities within the vacinity of a city, retrieve personal weather stations (PWSs) for those cities, get weather data related to PWSs, and create both map graphics and standard plots based on that data. The mapping data is based on the RgooleMaps package, which may be outside the normal usage of our audience. As part of the weatherProject package, a new R6 class (Pws) has been created and included to contain one or more PWSs and related attributes.
In order to get the full usage of this package, as user does need to sign up for 2 different types of online keys (both of which are free) and are required parameters for many of the functiosn. First, a key to access the wunderground API via https://www.wunderground.com/weather/api. Second, in order to overcome some of the current limitations of the wunderground API, a username to geonames via http://www.geonames.org/login.
There are many constraints to the current, free version of the wunderground API such as only allowing 10 calls per minute or 500 per day. Within many of the functions we have wait times of 10 seconds entered between calls and ability to limit number of items being returned by one function before using those results in a call to another function.
However, one thing that was unanticipated is that due to the way wunderground returned that information within its limitations led to many overlapping PWSs for major metropolitan cities In other words, there are now many PWSs reporting within each city, so that unless you are dealing with a rural area, wunderground will actually cap the number of PWSs returned to you before you have any that are even a mile apart. As a result, the end state weather information and visualization of variation are uninteresing. Here is an example of a pin map using the wunderground standard returned PWSs for San Diego, CA and RgoogleMaps for visualization.
If you are familiar with San Diego, then it will be clear that the maximum PWSs that will be returned are all within one microclimate and will show very little variation.
To counter this a combination of google maps API via the included longlat function and the subsequent usage of those coordinates in the findNearbyCities function that utilizes the geonames function as in the following call. Here to shield the protected information, it uses the variable myusername to represent the geocode username. The distance is 100km. The total number of cities being returned is capped at 5, because from every step after this one, calls will be made to wunderground and the daily limits will come into play.
mycities <- findNearbyCities(
username = mygeousername
,coord = longlat('San Diego, CA')
,distance = '100'
,maxRows = '5'
)
print(mycities)
## [1] "CA/San_Diego" "CA/Coronado" "CA/National_City"
## [4] "CA/Chula_Vista" "CA/Lemon_Grove"
Now we have set up a vector of cities to subsequently go in and get a better distribution of PWSs from the wunderground data.
The next step to drive toward the end result of seeing weather differences between the microclimates is now to get related PWSs for the chosen cities. This can be done simply by calling findPWS as in the following example. In the example, again, we are going to limit the returned results to ensure that the calls to wunderground do not pile up. Due to the limit of 1 put in for each city chosen, a maximum of 5 PWSs can result in this example. Again, a key is being masked here by the use of mywukey. The returned structure is a tibble.
mypws <- findPWS(myKey=mywukey, nearbyCities=mycities, maxPerCity = 1)
head(mypws)
## Source: local data frame [5 x 11]
## Groups: citseq [5]
##
## neighborhood city state country id lat
## <chr> <chr> <chr> <chr> <chr> <dbl>
## 1 Core-Columbia San Diego CA US KCASANDI299 32.71674
## 2 San Diego Yacht Club San Diego CA US KCASANDI186 32.71794
## 3 National City CA US ME3219 32.67050
## 4 Imperial Beach CA US KCAIMPER2 32.58029
## 5 Mount Vernon Lemon Grove CA US KCALEMON8 32.72964
## # ... with 5 more variables: lon <dbl>, distance_km <dbl>,
## # distance_mi <dbl>, citseq <int>, pwsNo <int>
With this basic output, we can now actually proceed to utilize the R6 class, Pws, which is part of the package. Feeding it into the class has the benefit of some data validation, but moreover the ability to plot the data by using the function defined within the class. In the following examples, the data is pushed into the class,
pwsr6 <- Pws$new(
id = mypws$id
,lat = mypws$lat
,lon = mypws$lon
,neighborhood = mypws$neighborhood
,city = mypws$city
,state = mypws$state
,country = mypws$country
,distance_km = mypws$distance_km
,distance_mi = mypws$distance_mi
)
pwsr6$print()
## # A tibble: 5 × 9
## id lat lon neighborhood city state
## <chr> <dbl> <dbl> <chr> <chr> <chr>
## 1 KCASANDI299 32.71674 -117.1623 Core-Columbia San Diego CA
## 2 KCASANDI186 32.71794 -117.2295 San Diego Yacht Club San Diego CA
## 3 ME3219 32.67050 -117.1010 National City CA
## 4 KCAIMPER2 32.58029 -117.1302 Imperial Beach CA
## 5 KCALEMON8 32.72964 -117.0340 Mount Vernon Lemon Grove CA
## # ... with 3 more variables: country <chr>, distance_km <dbl>,
## # distance_mi <dbl>
Now it is plotted, first with a pin drops from before, the with the ids as labels (which is the default).
pwsr6$plot(plottype = 'p')
When compared to the plot at the beginning of this vignette, one can see that, even though we have limited the number of PWSs returned significantly, we have greater dispersion throughou the San Diego area.
Now we will tag the ids, by using the default plot options for the class.
pwsr6$plot()
This plot can then be used as a geographical reference in cases where we may later plot weather data without a map.
First we will query the history available by using a start and end date applied to a tibble extracted from our class using one of the functions. The nearbyStations parameter could just as easily have been provided by the original data set generated prior to moving into our class.
myweather <- queryHistory(
myKey = mywukey
,nearbyStations = pwsr6$getTibble()
,startDate = '2017-03-17'
,endDate = '2017-03-20'
)
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170317/q/pws:KCASANDI299.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170318/q/pws:KCASANDI299.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170319/q/pws:KCASANDI299.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170320/q/pws:KCASANDI299.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170317/q/pws:KCASANDI186.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170318/q/pws:KCASANDI186.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170319/q/pws:KCASANDI186.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170320/q/pws:KCASANDI186.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170317/q/pws:ME3219.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170318/q/pws:ME3219.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170319/q/pws:ME3219.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170320/q/pws:ME3219.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170317/q/pws:KCAIMPER2.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170318/q/pws:KCAIMPER2.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170319/q/pws:KCAIMPER2.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170320/q/pws:KCAIMPER2.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170317/q/pws:KCALEMON8.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170318/q/pws:KCALEMON8.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170319/q/pws:KCALEMON8.json"
## [1] "http://api.wunderground.com/api/84952e8be8be479d/history_20170320/q/pws:KCALEMON8.json"
Now we will actually use the standard options for plotWeather. This will result in a map showing the average temperature (tempi) in English units, i.e. Faherenheit across the entire time period.
plotWeather(
wtbl = myweather
)
We only see a few degress of separation in the average. However, given the differences in the terrain of the San Diego area, let’s instead look at the range across the period, being the max - min.
plotWeather(
wtbl = myweather
,aggtype = "range"
)
Here we see much more differentiation, as the area most near the Pacific Ocean shows the least amount of variability, given the regulation of temperature that the massive body of water provides.
Now let us see if we can see any differentiation by looking at the minimum humidity across the time period.
plotWeather(
wtbl = myweather
,wvar = "hum"
,aggtype = "min"
)
What is most telling here is the very low mimimum % humidity of 39 in the inland area, which is neither bordering the ocean, nor in the mountains, and is known to be desert-like.
The plotWeather function also provides the ability to see a traditional line graph of conditions across time, rather than just a singular aggregate value. The following funcion call will look.
plotWeather(
wtbl = myweather
,plottype = "g"
)
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?